group: refactor MPIR_Group #7235

Merged
merged 10 commits into from
Feb 11, 2025


hzhou
Contributor

@hzhou hzhou commented Dec 10, 2024

Pull Request Description

Hide the internal fields of MPIR_Group from unnecessary access.

Outside group_util.c and group_impl.c, code only needs to assume the MPIR_Lpid
integer type, the creation routines based on an lpid map or an lpid stride
description, and the access routine that looks up an lpid from a group rank.

Add feature to use stride to describe group composition

Remove the linked list design in MPIR_Group_pmap_t
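
To illustrate the idea of describing group composition either by an explicit map or by a stride, here is a minimal sketch. It is not the PR's actual definition: the `sketch_` names and the `blocksize` field are assumptions made for illustration, while `use_map`, `u.map`, `u.stride.offset`, and `u.stride.stride` mirror field names quoted in the review comments on this page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for MPIR_Lpid (a signed 64-bit integer). */
typedef int64_t sketch_lpid_t;

/* Hypothetical process map: either an explicit table or a strided description. */
struct sketch_pmap {
    int size;                   /* number of ranks in the group */
    bool use_map;               /* true: use u.map, false: use u.stride */
    union {
        sketch_lpid_t *map;     /* map[rank] == lpid */
        struct {
            sketch_lpid_t offset;       /* lpid of rank 0 */
            sketch_lpid_t stride;       /* lpid distance between consecutive blocks */
            sketch_lpid_t blocksize;    /* assumed: contiguous lpids per block */
        } stride;
    } u;
};

/* Look up the lpid of a group rank without the caller touching internals. */
static sketch_lpid_t sketch_rank_to_lpid(const struct sketch_pmap *pmap, int rank)
{
    assert(rank >= 0 && rank < pmap->size);
    if (pmap->use_map) {
        return pmap->u.map[rank];
    }
    assert(pmap->u.stride.blocksize > 0);
    sketch_lpid_t i_blk = rank / pmap->u.stride.blocksize;  /* which block */
    sketch_lpid_t r_blk = rank % pmap->u.stride.blocksize;  /* position inside the block */
    return pmap->u.stride.offset + i_blk * pmap->u.stride.stride + r_blk;
}
```

Under these assumptions, a strided group needs only three integers regardless of its size, which is where the memory saving would come from.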

[skip warnings]

Plan

  1. Refactor MPIR_Group so it can be memory-efficient (strided rank map) -- this PR
  2. Always create a communicator from MPIR_Group rather than the other way around
    • This requires lpid to be device independent, and the device layer to perform address exchange upon communicator creation.
    • Initially, we can require the first non-trivial communicator always to be MPI_COMM_WORLD.
  3. Extend MPIR_Group to represent MPIR_Pset

Author Checklist

  • Provide Description
    Particularly focus on why, not what. Reference background, issues, test failures, xfail entries, etc.
  • Commits Follow Good Practice
    Commits are self-contained and do not do two things at once.
    Commit message is of the form: module: short description
    Commit message explains what's in the commit.
  • Passes All Tests
    Whitespace checker. Warnings test. Additional tests via comments.
  • Contribution Agreement
    For non-Argonne authors, check contribution agreement.
    If necessary, request an explicit comment from your company's PR approval manager.

@hzhou hzhou force-pushed the 2412_group branch 4 times, most recently from 576d5c7 to 0794b2f Compare December 11, 2024 14:23
@hzhou
Contributor Author

hzhou commented Dec 11, 2024

test:mpich/ch3/most
test:mpich/ch4/most
All ✔️ other than 2 node crashes

@hzhou hzhou requested a review from yfguo December 11, 2024 19:56
@hzhou
Contributor Author

hzhou commented Dec 12, 2024

test:mpich/ch3/most

Contributor

@yfguo yfguo left a comment


Looking good overall. Needs some attention w.r.t. the reuse of the map across groups.

newgrp->rank = rank;
MPIR_Group_set_session_ptr(newgrp, session_ptr);
newgrp->pmap.use_map = true;
newgrp->pmap.u.map = map;
Contributor


Is this shallow copy of map safe? Can MPICH code free a map before all groups using it are freed? We probably should refcount the map to safely reuse it across multiple MPIR_Group.

Contributor Author


The map is owned by the group. The map is never shared. On the other hand, the MPIR_Group can be shared and it is reference counted.
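
The ownership model described in this reply can be sketched as follows. This is only an illustration under assumed names (`sketch_group`, `sketch_group_create_map`, and so on are hypothetical, not MPICH routines): the creation routine takes ownership of the map, only the group object is reference counted, and the map is freed together with the group when the last reference goes away.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef int64_t sketch_lpid_t;  /* hypothetical stand-in for MPIR_Lpid */

struct sketch_group {
    int ref_count;              /* the group is shared and reference counted */
    int size;
    sketch_lpid_t *map;         /* owned exclusively by this group, never shared */
};

/* On success the group takes ownership of `map`; the caller must not free
 * or reuse it afterwards. On failure (NULL) the caller still owns `map`. */
static struct sketch_group *sketch_group_create_map(int size, sketch_lpid_t *map)
{
    struct sketch_group *g = malloc(sizeof(*g));
    if (g == NULL) {
        return NULL;
    }
    g->ref_count = 1;
    g->size = size;
    g->map = map;
    return g;
}

static void sketch_group_add_ref(struct sketch_group *g)
{
    g->ref_count++;
}

/* Drop one reference. Returns 1 if the group (and its map) was destroyed. */
static int sketch_group_release(struct sketch_group *g)
{
    assert(g->ref_count > 0);
    if (--g->ref_count > 0) {
        return 0;
    }
    free(g->map);               /* safe: no other group points at this map */
    free(g);
    return 1;
}
```

With this shape, sharing happens by taking another reference on the group rather than by copying the map pointer into a second group, so a separate reference count on the map is not needed.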

@hzhou hzhou force-pushed the 2412_group branch 2 times, most recently from 1535d80 to 351cd7f Compare February 7, 2025 22:44
@hzhou
Contributor Author

hzhou commented Feb 7, 2025

test:mpich/ch3/most
test:mpich/ch4/most ✔️

@hzhou
Contributor Author

hzhou commented Feb 10, 2025

The last commit is likely missing some dependencies from the next PRs. Please review without the last commit. I'll drop the commit if I cannot resolve the issue in time.

Contributor

@raffenet raffenet left a comment


LGTM up until last commit

@hzhou hzhou force-pushed the 2412_group branch 3 times, most recently from ac5b69e to a78823b Compare February 11, 2025 03:32
@hzhou
Contributor Author

hzhou commented Feb 11, 2025

test:mpich/ch3/most
test:mpich/ch4/most

@hzhou
Contributor Author

hzhou commented Feb 11, 2025

Fixed the last commit. The culprit was MPIR_Lpid typed as uint64_t. Changing it to int64_t made the math (the stride pmap) work.
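
The comment does not say which expression failed, so the following is only a plausible illustration of why unsigned arithmetic is fragile for stride math, not a reconstruction of the actual bug. The two helper functions are hypothetical. They compute the same quotient, `(lpid - offset) / stride`, once with `int64_t` and once with `uint64_t`; the unsigned version wraps modulo 2^64 whenever `lpid < offset` or the stride is meant to be negative.

```c
#include <assert.h>
#include <stdint.h>

/* Signed version: negative differences and negative strides behave as expected. */
static int64_t sketch_block_index_signed(int64_t lpid, int64_t offset, int64_t stride)
{
    assert(stride != 0);
    return (lpid - offset) / stride;
}

/* Unsigned version: the subtraction wraps modulo 2^64 when lpid < offset,
 * and a "negative" stride becomes a huge positive number. */
static uint64_t sketch_block_index_unsigned(uint64_t lpid, uint64_t offset, uint64_t stride)
{
    assert(stride != 0);
    return (lpid - offset) / stride;
}
```

For example, for a descending sequence 6, 4, 2 (offset 6, stride -2), the signed helper maps lpid 2 to index 2, while the unsigned helper yields 0 because both the difference and the stride have wrapped. With a signed type, an out-of-range result shows up as a negative number that an assertion can catch.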

@hzhou hzhou requested a review from raffenet February 11, 2025 05:06
This test requires access to MPICH internals, so it cannot be used with
the current design.
We no longer use this file.
Hide the internal fields of MPIR_Group from unnecessary access.

Outside group_util.c and group_impl.c, code only needs to assume the MPIR_Lpid
integer type, the creation routines based on an lpid map or an lpid stride
description, and the access routine that looks up an lpid from a group rank.
For most external usages, we only need MPIR_Group_rank_to_lpid.
Avoid accessing group internal fields.
Group similar functions together to facilitate refactoring.
There are no changes in this commit other than moving functions around.

The 4 incl/excl functions are very similar.

The 3 difference/intersection/union functions are very similar.
Use MPIR_Group_{rank_to_lpid,lpid_to_rank} to avoid directly accessing
MPIR_Group internal fields.

For most group creation routines, just populate an lpid lookup map and
call MPIR_Group_create_map to create the group.
* add option to use stride to describe group composition
* remove the linked list design
The unsigned type uint64_t is dangerous as we perform pmap stride math.
Signed integer should always work and we can use assertions to check its
range.
* Add check_map_is_strided to detect strided pattern and convert a map into a
strided pmap.

* Move internal static routines to the bottom of grouputil.c.
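
As an illustration of what detecting a strided pattern in a map could look like, here is a minimal sketch. It is an assumption-laden simplification, not the PR's check_map_is_strided: the function name and signature are hypothetical, and it only recognizes a constant difference between consecutive entries (no multi-element blocks).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true if map[0..n-1] has the form offset + i * stride with a
 * nonzero stride, and writes offset/stride only in that case. */
static bool sketch_map_is_strided(const int64_t *map, int n,
                                  int64_t *offset, int64_t *stride)
{
    if (n <= 0) {
        return false;           /* an empty map has no meaningful stride */
    }
    int64_t s = 1;              /* a single entry is trivially strided */
    if (n > 1) {
        s = map[1] - map[0];
        if (s == 0) {
            return false;       /* repeated lpids cannot form a valid group */
        }
        for (int i = 2; i < n; i++) {
            if (map[i] - map[i - 1] != s) {
                return false;
            }
        }
    }
    *offset = map[0];
    *stride = s;
    return true;
}
```

A caller could then discard the explicit map and keep only the offset and stride when the check succeeds. Note that this sketch never reports a stride of 0.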
Contributor

@yfguo yfguo left a comment


LGTM

@hzhou hzhou merged commit 74357f0 into pmodels:main Feb 11, 2025
4 checks passed
@hzhou hzhou deleted the 2412_group branch February 11, 2025 19:29
@dalcinl
Contributor

dalcinl commented Feb 12, 2025

@hzhou I believe this PR broke mpi4py tests:

testDifference (test_group.TestGroupSelf.testDifference) ... /home/runner/work/_temp/e71ecd9b-95ff-4ee1-97e1-da415480ba87.sh: line 1: 162928 Floating point exception(core dumped) python test/main.py -v

Full logs here. I only see useful logs with the runs using gforker, but not hydra. Maybe something else is going on.

@hzhou
Contributor Author

hzhou commented Feb 12, 2025

Not sure how it resulted in a floating point exception, considering we don't use floating point anywhere (other than reduction ops). I see it dumps core. Could you obtain a backtrace of it?

@dalcinl
Contributor

dalcinl commented Feb 12, 2025

The UCX build is slightly more informative:

testDifference (test_group.TestGroupSelf.testDifference) ... [fv-az787-562:164480:0:164480] Caught signal 8 (Floating point exception: integer divide by zero)
==== backtrace (tid: 164480) ====
 0  /lib/libucs.so.0(ucs_handle_error+0x2e4) [0x7f47fb7c1a54]
 1  /lib/libucs.so.0(+0x34c4c) [0x7f47fb7c1c4c]
 2  /lib/libucs.so.0(+0x34fda) [0x7f47fb7c1fda]
 3  /lib/x86_64-linux-gnu/libc.so.6(+0x45320) [0x7f47fe645320]
 4  /usr/local/lib/libmpi.so.0(+0x57c720) [0x7f47fbd7c720]
 5  /usr/local/lib/libmpi.so.0(+0x57b86e) [0x7f47fbd7b86e]
 6  /usr/local/lib/libmpi.so.0(+0x578eaa) [0x7f47fbd78eaa]
 7  /usr/local/lib/libmpi.so.0(+0x16cd98) [0x7f47fb96cd98]
 8  /usr/local/lib/libmpi.so.0(PMPI_Group_difference+0x29) [0x7f47fb96ceed]

Full logs here.

@dalcinl
Contributor

dalcinl commented Feb 12, 2025

mpi4py tests use group operations involving MPI_GROUP_EMPTY, and I suspect that's the place the 0 is coming from.
The backtrace points to file and line src/mpi/group/grouputil.c:452.

...
    } else {
        lpid -= pmap->u.stride.offset;
        MPIR_Lpid i_blk = lpid / pmap->u.stride.stride;
        MPIR_Lpid r_blk = lpid % pmap->u.stride.stride;
...
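
If the suspicion above is right and an empty group leaves the stride at 0, a guard before the division would avoid the integer divide by zero. The following is a minimal sketch of such a guard, not the fix that was actually applied: the struct, the function name, and the `blocksize` and `size` fields are assumptions, and only the `offset`/`stride` arithmetic mirrors the quoted lines.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical strided process map. */
struct sketch_stride_pmap {
    int size;           /* number of ranks; 0 for an empty group */
    int64_t offset;
    int64_t stride;
    int64_t blocksize;  /* assumed: contiguous lpids per block */
};

/* Returns the group rank of `lpid`, or -1 if it is not in the group. */
static int sketch_lpid_to_rank(const struct sketch_stride_pmap *pmap, int64_t lpid)
{
    /* Guard: an empty group (or a zero stride) contains nothing,
     * and must not reach the division below. */
    if (pmap->size == 0 || pmap->stride == 0) {
        return -1;
    }
    lpid -= pmap->offset;
    int64_t i_blk = lpid / pmap->stride;
    int64_t r_blk = lpid % pmap->stride;
    if (i_blk < 0 || r_blk < 0 || r_blk >= pmap->blocksize) {
        return -1;
    }
    int64_t rank = i_blk * pmap->blocksize + r_blk;
    return rank < pmap->size ? (int) rank : -1;
}
```

With this guard, looking up any lpid in an all-zero (empty) map returns -1 instead of raising SIGFPE.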
